Skip to content

Conversation

@zeyi2
Copy link
Member

@zeyi2 zeyi2 commented Nov 2, 2025

The patch implements clang-tidy-alphabetical-order-check.py, a small Python tool to normalize clang-tidy docs with deterministic ordering.

It currently checks these two files:

Closes #166021

@zeyi2 zeyi2 changed the title [clang-tidy][docs] Implement alphabetical order test [clang-tidy] Implement alphabetical order test Nov 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 2, 2025

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang-tools-extra

Author: mitchell (zeyi2)

Changes

The patch implements clang-tidy-alphabetical-order-check.py, a small Python tool to normalize clang-tidy docs with deterministic ordering.

It currently checks these two files:

Closes #166021


Full diff: https://github.com/llvm/llvm-project/pull/166072.diff

3 Files Affected:

  • (added) clang-tools-extra/clang-tidy/tool/clang-tidy-alphabetical-order-check.py (+301)
  • (added) clang-tools-extra/test/clang-tidy/infrastructure/alphabetical-order.cpp (+6)
  • (modified) clang-tools-extra/test/lit.cfg.py (+1)
diff --git a/clang-tools-extra/clang-tidy/tool/clang-tidy-alphabetical-order-check.py b/clang-tools-extra/clang-tidy/tool/clang-tidy-alphabetical-order-check.py
new file mode 100644
index 0000000000000..321663bb7d577
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/tool/clang-tidy-alphabetical-order-check.py
@@ -0,0 +1,301 @@
+#!/usr/bin/env python3
+
+"""
+Normalize clang-tidy docs with deterministic sorting for linting/tests.
+
+Subcommands:
+  - checks-list: Sort entries in docs/clang-tidy/checks/list.rst csv-table.
+  - release-notes: Sort key sections in docs/ReleaseNotes.rst and de-duplicate
+                   entries in "Changes in existing checks".
+
+Usage:
+  clang-tidy-alphabetical-order-check.py <subcommand> [-i <input rst>] [-o <output rst>] [--fix]
+
+Flags:
+  -i/--input   Input file.
+  -o/--output  Write normalized content here; omit to write to stdout.
+  --fix        Rewrite the input file in place. Cannot be combined with -o/--output.
+"""
+
+import argparse
+import io
+import os
+import re
+import sys
+from typing import List, Optional, Sequence, Tuple
+
+DOC_LABEL_RN_RE = re.compile(r":doc:`(?P<label>[^`<]+)\s*(?:<[^>]+>)?`")
+DOC_LINE_RE = re.compile(r"^\s*:doc:`(?P<label>[^`<]+?)\s*<[^>]+>`.*$")
+
+
+def script_dir() -> str:
+    return os.path.dirname(os.path.abspath(__file__))
+
+
+def read_text(path: str) -> List[str]:
+    with io.open(path, "r", encoding="utf-8") as f:
+        return f.read().splitlines(True)
+
+
+def write_text(path: str, content: str) -> None:
+    with io.open(path, "w", encoding="utf-8", newline="") as f:
+        f.write(content)
+
+
+def normalize_list_rst(lines: List[str]) -> str:
+    out: List[str] = []
+    i = 0
+    n = len(lines)
+    while i < n:
+        out.append(lines[i])
+        if lines[i].lstrip().startswith(".. csv-table::"):
+            i += 1
+            break
+        i += 1
+
+    while i < n and (lines[i].startswith(" ") or lines[i].strip() == ""):
+        if DOC_LINE_RE.match(lines[i]):
+            break
+        out.append(lines[i])
+        i += 1
+
+    entries: List[str] = []
+    while i < n and lines[i].startswith(" "):
+        if DOC_LINE_RE.match(lines[i]):
+            entries.append(lines[i])
+        else:
+            entries.append(lines[i])
+        i += 1
+
+    def key_for(line: str):
+        m = DOC_LINE_RE.match(line)
+        if not m:
+            return (1, "")
+        return (0, m.group("label"))
+
+    entries_sorted = sorted(entries, key=key_for)
+    out.extend(entries_sorted)
+    out.extend(lines[i:])
+
+    return "".join(out)
+
+
+def run_checks_list(
+    inp: Optional[str], out_path: Optional[str], fix: bool
+) -> int:
+    if not inp:
+        inp = os.path.normpath(
+            os.path.join(
+                script_dir(),
+                "..",
+                "..",
+                "docs",
+                "clang-tidy",
+                "checks",
+                "list.rst",
+            )
+        )
+    lines = read_text(inp)
+    normalized = normalize_list_rst(lines)
+    if fix and out_path:
+        sys.stderr.write("error: --fix cannot be used together with --output\n")
+        return 2
+    if fix:
+        original = "".join(lines)
+        if original != normalized:
+            write_text(inp, normalized)
+        return 0
+    if out_path:
+        write_text(out_path, normalized)
+        return 0
+    sys.stdout.write(normalized)
+    return 0
+
+
+def find_heading(lines: Sequence[str], title: str) -> Optional[int]:
+    for i in range(len(lines) - 1):
+        if lines[i].rstrip("\n") == title:
+            underline = lines[i + 1].rstrip("\n")
+            if (
+                underline
+                and set(underline) == {"^"}
+                and len(underline) >= len(title)
+            ):
+                return i
+    return None
+
+
+def extract_label(text: str) -> str:
+    m = DOC_LABEL_RN_RE.search(text)
+    return m.group("label") if m else text
+
+
+def is_bullet_start(line: str) -> bool:
+    return line.startswith("- ")
+
+
+def collect_bullet_blocks(
+    lines: Sequence[str], start: int, end: int
+) -> Tuple[List[str], List[Tuple[str, List[str]]], List[str]]:
+    i = start
+    n = end
+    first_bullet = i
+    while first_bullet < n and not is_bullet_start(lines[first_bullet]):
+        first_bullet += 1
+    prefix = list(lines[i:first_bullet])
+
+    blocks: List[Tuple[str, List[str]]] = []
+    i = first_bullet
+    while i < n:
+        if not is_bullet_start(lines[i]):
+            break
+        bstart = i
+        i += 1
+        while i < n and not is_bullet_start(lines[i]):
+            if (
+                i + 1 < n
+                and set(lines[i + 1].rstrip("\n")) == {"^"}
+                and lines[i].strip()
+            ):
+                break
+            i += 1
+        block = list(lines[bstart:i])
+        key = extract_label(block[0])
+        blocks.append((key, block))
+
+    suffix = list(lines[i:n])
+    return prefix, blocks, suffix
+
+
+def sort_and_dedup_blocks(
+    blocks: List[Tuple[str, List[str]]], dedup: bool = False
+) -> List[List[str]]:
+    seen = set()
+    filtered: List[Tuple[str, List[str]]] = []
+    for key, block in blocks:
+        if dedup:
+            if key in seen:
+                continue
+            seen.add(key)
+        filtered.append((key, block))
+    filtered.sort(key=lambda kb: kb[0])
+    return [b for _, b in filtered]
+
+
+def normalize_release_notes(lines: List[str]) -> str:
+    sections = [
+        ("New checks", False),
+        ("New check aliases", False),
+        ("Changes in existing checks", True),
+    ]
+
+    out = list(lines)
+
+    for idx in range(len(sections) - 1, -1, -1):
+        title, dedup = sections[idx]
+        h_start = find_heading(out, title)
+
+        if h_start is None:
+            continue
+
+        sec_start = h_start + 2
+
+        if idx + 1 < len(sections):
+            next_title = sections[idx + 1][0]
+            h_end = find_heading(out, next_title)
+            if h_end is None:
+                h_end = sec_start
+                while h_end + 1 < len(out):
+                    if out[h_end].strip() and set(
+                        out[h_end + 1].rstrip("\n")
+                    ) == {"^"}:
+                        break
+                    h_end += 1
+            sec_end = h_end
+        else:
+            h_end = sec_start
+            while h_end + 1 < len(out):
+                if out[h_end].strip() and set(out[h_end + 1].rstrip("\n")) == {
+                    "^"
+                }:
+                    break
+                h_end += 1
+            sec_end = h_end
+
+        prefix, blocks, suffix = collect_bullet_blocks(out, sec_start, sec_end)
+        sorted_blocks = sort_and_dedup_blocks(blocks, dedup=dedup)
+
+        new_section: List[str] = []
+        new_section.extend(prefix)
+        for i_b, b in enumerate(sorted_blocks):
+            if i_b > 0 and (
+                not new_section
+                or (new_section and new_section[-1].strip() != "")
+            ):
+                new_section.append("\n")
+            new_section.extend(b)
+        new_section.extend(suffix)
+
+        out = out[:sec_start] + new_section + out[sec_end:]
+
+    return "".join(out)
+
+
+def run_release_notes(
+    inp: Optional[str], out_path: Optional[str], fix: bool
+) -> int:
+    if not inp:
+        inp = os.path.normpath(
+            os.path.join(script_dir(), "..", "..", "docs", "ReleaseNotes.rst")
+        )
+    lines = read_text(inp)
+    normalized = normalize_release_notes(lines)
+    if fix and out_path:
+        sys.stderr.write("error: --fix cannot be used together with --output\n")
+        return 2
+    if fix:
+        original = "".join(lines)
+        if original != normalized:
+            write_text(inp, normalized)
+        return 0
+    if out_path:
+        write_text(out_path, normalized)
+        return 0
+    sys.stdout.write(normalized)
+    return 0
+
+
+def main(argv: List[str]) -> int:
+    ap = argparse.ArgumentParser()
+    sub = ap.add_subparsers(dest="cmd", required=True)
+
+    ap_checks = sub.add_parser(
+        "checks-list", help="normalize clang-tidy checks list.rst"
+    )
+    ap_checks.add_argument("-i", "--input", dest="inp", default=None)
+    ap_checks.add_argument("-o", "--output", dest="out", default=None)
+    ap_checks.add_argument(
+        "--fix", action="store_true", help="rewrite the input file in place"
+    )
+
+    ap_rn = sub.add_parser(
+        "release-notes", help="normalize ReleaseNotes.rst sections"
+    )
+    ap_rn.add_argument("-i", "--input", dest="inp", default=None)
+    ap_rn.add_argument("-o", "--output", dest="out", default=None)
+    ap_rn.add_argument(
+        "--fix", action="store_true", help="rewrite the input file in place"
+    )
+
+    args = ap.parse_args(argv)
+
+    if args.cmd == "checks-list":
+        return run_checks_list(args.inp, args.out, args.fix)
+    if args.cmd == "release-notes":
+        return run_release_notes(args.inp, args.out, args.fix)
+
+    ap.error("unknown command")
+
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
diff --git a/clang-tools-extra/test/clang-tidy/infrastructure/alphabetical-order.cpp b/clang-tools-extra/test/clang-tidy/infrastructure/alphabetical-order.cpp
new file mode 100644
index 0000000000000..4a2598b93942b
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/infrastructure/alphabetical-order.cpp
@@ -0,0 +1,6 @@
+// RUN: %python %S/../../../clang-tidy/tool/clang-tidy-alphabetical-order-check.py checks-list -i %S/../../../docs/clang-tidy/checks/list.rst -o %t.list
+// RUN: diff --strip-trailing-cr %t.list \
+// RUN:   %S/../../../docs/clang-tidy/checks/list.rst
+
+// RUN: %python %S/../../../clang-tidy/tool/clang-tidy-alphabetical-order-check.py release-notes -i %S/../../../docs/ReleaseNotes.rst -o %t.rn
+// RUN: diff --strip-trailing-cr %t.rn %S/../../../docs/ReleaseNotes.rst
diff --git a/clang-tools-extra/test/lit.cfg.py b/clang-tools-extra/test/lit.cfg.py
index c1da37d61bd61..c39ea29329674 100644
--- a/clang-tools-extra/test/lit.cfg.py
+++ b/clang-tools-extra/test/lit.cfg.py
@@ -57,6 +57,7 @@
 if config.clang_tidy_custom_check:
     config.available_features.add("custom-check")
 python_exec = shlex.quote(config.python_executable)
+config.substitutions.append(("%python", python_exec))
 check_clang_tidy = os.path.join(
     config.test_source_root, "clang-tidy", "check_clang_tidy.py"
 )

@github-actions
Copy link

github-actions bot commented Nov 2, 2025

✅ With the latest revision this PR passed the Python code formatter.

@zeyi2
Copy link
Member Author

zeyi2 commented Nov 2, 2025

Currently, the deduplication process has issues. For example:

Changes in existing checks
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Improved :doc:`bugprone-easily-swappable-parameters
  <clang-tidy/checks/bugprone/easily-swappable-parameters>` check by
  correcting a spelling mistake on its option
  ``NamePrefixSuffixSilenceDissimilarityTreshold``.

- Improved :doc:`bugprone-easily-swappable-parameters
  <clang-tidy/checks/bugprone/easily-swappable-parameters>` check
  this is just an example test.

the check will hint:

# executed command: diff --strip-trailing-cr - /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../docs/ReleaseNotes.rst
# .---command stdout------------
# | *** -
# | --- /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../docs/ReleaseNotes.rst
# | ***************
# | *** 272,277 ****
# | --- 272,281 ----
# |     correcting a spelling mistake on its option
# |     ``NamePrefixSuffixSilenceDissimilarityTreshold``.
# |   
# | + - Improved :doc:`bugprone-easily-swappable-parameters
# | +   <clang-tidy/checks/bugprone/easily-swappable-parameters>` check
# | +   this is just an example test.
# | + 
# |   - Improved :doc:`bugprone-exception-escape
# |     <clang-tidy/checks/bugprone/exception-escape>` check's handling of lambdas:
# |     exceptions from captures are now diagnosed, exceptions in the bodies of
# `-----------------------------
# error: command failed with exit status: 1

The intent is to surface duplicates so developers can merge them manually. However, when run with the --fix flag, the tool deletes the later duplicate entry outright, leaving the file as:

Changes in existing checks
^^^^^^^^^^^^^^^^^^^^^^^^^^

- Improved :doc:`bugprone-easily-swappable-parameters
  <clang-tidy/checks/bugprone/easily-swappable-parameters>` check by
  correcting a spelling mistake on its option
  ``NamePrefixSuffixSilenceDissimilarityTreshold``.

This causes unintended content loss.

I'm not sure how to deal with this, perhaps emits a warning instead of deleting entries?

Any suggestions are welcome.

@vbvictor
Copy link
Contributor

vbvictor commented Nov 2, 2025

The intent is to surface duplicates so developers can merge them manually. However, when run with the --fix flag, the tool deletes the later duplicate entry outright, leaving the file as:

Could we make the script do nothing and don't delete duplicated entries? Just sort them in alphabetical order and let the user fix it by hand.

Copy link
Contributor

@vbvictor vbvictor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The script is hard to review at this form with, could we make functions smaller, especially normalize_release_notes.

Comment on lines 18 to 29
Subcommands:
- checks-list: Sort entries in docs/clang-tidy/checks/list.rst csv-table.
- release-notes: Sort key sections in docs/ReleaseNotes.rst and de-duplicate
entries in "Changes in existing checks".

Usage:
clang-tidy-alphabetical-order-check.py <subcommand> [-i <input rst>] [-o <output rst>] [--fix]

Flags:
-i/--input Input file.
-o/--output Write normalized content here; omit to write to stdout.
--fix Rewrite the input file in place. Cannot be combined with -o/--output.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need any flags and subcommands expect -o flag (for testing reasons).
Let it always try to fix by default.
Since we know that we are in LLVM repo, just locate the needed files.
See how https://github.com/llvm/llvm-project/blob/main/clang/docs/tools/dump_format_style.py is implemented, our script structure should be similar to it.

f.write(content)


def normalize_list_rst(lines: List[str]) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment what input/output the function has

return 0


def find_heading(lines: Sequence[str], title: str) -> Optional[int]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment what heading styly do we look for

@zeyi2
Copy link
Member Author

zeyi2 commented Nov 3, 2025

Sorry for the inconvenience I brought, I've updated the python script and it would be easier to review now. Thank you for your patience.

@zeyi2 zeyi2 requested a review from vbvictor November 3, 2025 04:18
@zeyi2
Copy link
Member Author

zeyi2 commented Nov 3, 2025

With the latest bugfix, the CI is expected to fail on docs/clang-tidy/checks/list.rst (see PR #166123), but it didn’t.

I can reproduce the expected failure locally (logs below):

[2540/2541] Running clang_tools regression tests
FAIL: Clang Tools :: clang-tidy/infrastructure/alphabetical-order.cpp (1 of 1301)
******************** TEST 'Clang Tools :: clang-tidy/infrastructure/alphabetical-order.cpp' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/usr/bin/python3.13 /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../clang-tidy/tool/clang-tidy-alphabetical-order-check.py -o /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/test/clang-tidy/infrastructure/Output/alphabetical-order.cpp.tmp.clang-tidy-checks-list.rst
# executed command: /usr/bin/python3.13 /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../clang-tidy/tool/clang-tidy-alphabetical-order-check.py -o /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/test/clang-tidy/infrastructure/Output/alphabetical-order.cpp.tmp.clang-tidy-checks-list.rst
# RUN: at line 2
diff --strip-trailing-cr /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/test/clang-tidy/infrastructure/Output/alphabetical-order.cpp.tmp.clang-tidy-checks-list.rst /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../docs/clang-tidy/checks/list.rst
# executed command: diff --strip-trailing-cr /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/test/clang-tidy/infrastructure/Output/alphabetical-order.cpp.tmp.clang-tidy-checks-list.rst /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../docs/clang-tidy/checks/list.rst
# .---command stdout------------
# | *** /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/test/clang-tidy/infrastructure/Output/alphabetical-order.cpp.tmp.clang-tidy-checks-list.rst
# | --- /home/mitchell/Documents/projects/llvm-project/clang-tools-extra/test/clang-tidy/infrastructure/../../../docs/clang-tidy/checks/list.rst
# | ***************
# | *** 442,449 ****
# |      :doc:`cert-dcl51-cpp <cert/dcl51-cpp>`, :doc:`bugprone-reserved-identifier <bugprone/reserved-identifier>`, "Yes"
# |      :doc:`cert-dcl54-cpp <cert/dcl54-cpp>`, :doc:`misc-new-delete-overloads <misc/new-delete-overloads>`,
# |      :doc:`cert-dcl59-cpp <cert/dcl59-cpp>`, :doc:`google-build-namespaces <google/build-namespaces>`,
# |      :doc:`cert-env33-c <cert/env33-c>`, :doc:`bugprone-command-processor <bugprone/command-processor>`,
# | -    :doc:`cert-err09-cpp <cert/err09-cpp>`, :doc:`misc-throw-by-value-catch-by-reference <misc/throw-by-value-catch-by-reference>`,
# |      :doc:`cert-err34-c <cert/err34-c>`, :doc:`bugprone-unchecked-string-to-number-conversion <bugprone/unchecked-string-to-number-conversion>`,
# |      :doc:`cert-err52-cpp <cert/err52-cpp>`, :doc:`modernize-avoid-setjmp-longjmp <modernize/avoid-setjmp-longjmp>`,
# |      :doc:`cert-err58-cpp <cert/err58-cpp>`, :doc:`bugprone-throwing-static-initialization <bugprone/throwing-static-initialization>`,
# | --- 442,449 ----
# |      :doc:`cert-dcl51-cpp <cert/dcl51-cpp>`, :doc:`bugprone-reserved-identifier <bugprone/reserved-identifier>`, "Yes"
# |      :doc:`cert-dcl54-cpp <cert/dcl54-cpp>`, :doc:`misc-new-delete-overloads <misc/new-delete-overloads>`,
# |      :doc:`cert-dcl59-cpp <cert/dcl59-cpp>`, :doc:`google-build-namespaces <google/build-namespaces>`,
# | +    :doc:`cert-err09-cpp <cert/err09-cpp>`, :doc:`misc-throw-by-value-catch-by-reference <misc/throw-by-value-catch-by-reference>`,
# |      :doc:`cert-env33-c <cert/env33-c>`, :doc:`bugprone-command-processor <bugprone/command-processor>`,
# |      :doc:`cert-err34-c <cert/err34-c>`, :doc:`bugprone-unchecked-string-to-number-conversion <bugprone/unchecked-string-to-number-conversion>`,
# |      :doc:`cert-err52-cpp <cert/err52-cpp>`, :doc:`modernize-avoid-setjmp-longjmp <modernize/avoid-setjmp-longjmp>`,
# |      :doc:`cert-err58-cpp <cert/err58-cpp>`, :doc:`bugprone-throwing-static-initialization <bugprone/throwing-static-initialization>`,
# `-----------------------------
# error: command failed with exit status: 1

--

********************
********************
Failed Tests (1):
  Clang Tools :: clang-tidy/infrastructure/alphabetical-order.cpp


Testing Time: 42.01s

Total Discovered Tests: 3055
  Unsupported      :    7 (0.23%)
  Passed           : 3046 (99.71%)
  Expectedly Failed:    1 (0.03%)
  Failed           :    1 (0.03%)
FAILED: [code=1] tools/clang/tools/extra/CMakeFiles/check-clang-tools /home/mitchell/Documents/projects/llvm-project/build/tools/clang/tools/extra/CMakeFiles/check-clang-tools 

Not quite sure why CI didn't give the same result :(

Copy link
Contributor

@EugeneZelenko EugeneZelenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be good idea to add some small tests for script itself.


"""

ClangTidy Alphabetical Order Checker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ClangTidy Alphabetical Order Checker
Clang-Tidy Alphabetical Order Checker

ClangTidy Alphabetical Order Checker
====================================

Normalize clang-tidy docs with deterministic sorting for linting/tests.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Normalize clang-tidy docs with deterministic sorting for linting/tests.
Normalize Clang-Tidy documentation with deterministic sorting for linting/tests.

f.write(content)


def normalize_list_rst(lines: List[str]) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still support Python versions that does not support list in type annotation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK we require python 3.8 for llvm and list is for 3.9+

There was rfc on discourse to bump the minimum python version but it got pushed back

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like support for 3.8 should be dropped, even 3.9 had reached end of life: https://www.python.org/downloads/. Sure, this is LLVM-wide topic and should be decided by wider audience.

Copy link
Contributor

@vbvictor vbvictor Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is topic about it: https://discourse.llvm.org/t/rfc-upgrading-llvm-s-minimum-required-python-version/88605.

And it seems that we would still keep 3.8 as minimal version for some time in the future despite it being EOL..


Behavior:
- Sort entries in docs/clang-tidy/checks/list.rst csv-table.
- Sort key sections in docs/ReleaseNotes.rst. Does not remove duplicate
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will warning for duplicated section be issued?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently no. But I'm working on this feature.

@@ -0,0 +1,5 @@
// RUN: %python %S/../../../clang-tidy/tool/clang-tidy-alphabetical-order-check.py -o %t.clang-tidy-checks-list.rst
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be called in CMake.

Copy link
Contributor

@vbvictor vbvictor Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it be in CMake, could you elaborate?
We already have similar tests, like this https://github.com/llvm/llvm-project/blob/main/clang/test/Format/docs_updated.test to ensure that docs are updated.
What would be the workflow involving CMake?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, CMake should only be used as project-configuration step, it shouldn't run any tests beside some library checking.

@zeyi2
Copy link
Member Author

zeyi2 commented Nov 4, 2025

Hi, I added a duplicate check to detect duplicates in the "Changes in existing checks" section of ReleaseNotes.
The script currently checks ordering first. Only when ordering is correct does it scan for duplicates and fail with exit
code 3 (maybe we need to choose a different exit code).

The current duplicate report looks like this:

# .---command stderr------------
# | Error: Duplicate entries in 'Changes in existing checks':
# | 
# | -- Duplicate: - Improved :doc:`bugprone-easily-swappable-parameters
# | 
# | - At line 280:
# | - Improved :doc:`bugprone-easily-swappable-parameters
# |   <clang-tidy/checks/bugprone/easily-swappable-parameters>` check by
# |   correcting a spelling mistake on its option
# |   ``NamePrefixSuffixSilenceDissimilarityTreshold``.
# | 
# | - At line 285:
# | - Improved :doc:`bugprone-easily-swappable-parameters
# |   <clang-tidy/checks/bugprone/easily-swappable-parameters>` check by
# |   correcting a spelling mistake on its option
# |   ``NamePrefixSuffixSilenceDissimilarityTreshold``.
# | 
# `-----------------------------
# error: command failed with exit status: 3
--

I'm still working on the tests of this script. Meanwhile, feedbacks of any kind are appreciated.

@vbvictor
Copy link
Contributor

vbvictor commented Nov 4, 2025

As for the test, it could be a simple xxx_test.py script, look at how tests in https://github.com/llvm/llvm-project/tree/main/.ci are organized. AFAIK they are not run in premerge CI, but for small tests I guess it's not needed that much.

As for the script name, we should remove "clang-tidy" part because it duplicates a part presented in filepath, so It should be check-alphabetical-order.py and check-alphabetical-order_test.py (reordered words a bit).

@zeyi2
Copy link
Member Author

zeyi2 commented Nov 6, 2025

Hi, I've updated the testcases and remove all List[] in function declaration. Now the checker can work with python 3.8.

Comment on lines 18 to 29
def _load_script_module():
here = os.path.dirname(cast(str, __file__))
script_path = os.path.normpath(os.path.join(here, "check-alphabetical-order.py"))
loader = importlib.machinery.SourceFileLoader(
"check_alphabetical_order", cast(str, script_path)
)
spec = importlib.util.spec_from_loader(loader.name, loader)
if spec is None or spec.loader is None:
raise ImportError(f"Failed to load spec for {script_path}")
mod = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mod)
return mod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all needed because you can't import --delimeted script name?
Lets rename it check_alphabetical_order.py and then just import check_alphabetical_order

Comment on lines 52 to 59
pos_abseil = out.find("abseil-cleanup-ctad")
pos_bugprone = out.find("bugprone-virtual-near-miss")
pos_cert = out.find("cert-flp30-c")
self.assertTrue(all(p != -1 for p in [pos_abseil, pos_bugprone, pos_cert]))
self.assertLess(pos_abseil, pos_bugprone)
self.assertLess(pos_bugprone, pos_cert)
# Non-doc row should remain after doc rows within the table region.
self.assertGreater(out.find("A non-doc row"), out.find("cert-flp30-c"))
Copy link
Contributor

@vbvictor vbvictor Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assertTrue, assertLess is hard to understand, could we instead match the text as a whole?
E.g. given input

".. csv-table:: Clang-Tidy checks\n",
'   :header: "Name", "Offers fixes"\n',
"\n",
'   :doc:`bugprone-virtual-near-miss <bugprone/virtual-near-miss>`, "Yes"\n',
"   :doc:`cert-flp30-c <cert/flp30-c>`,\n",
'   :doc:`abseil-cleanup-ctad <abseil/cleanup-ctad>`, "Yes"\n',
"   A non-doc row that should stay after docs\n",

We expect output

".. csv-table:: Clang-Tidy checks\n",
'   :header: "Name", "Offers fixes"\n',
"\n",
'   :doc:`abseil-cleanup-ctad <abseil/cleanup-ctad>`, "Yes"\n',
'   :doc:`bugprone-virtual-near-miss <bugprone/virtual-near-miss>`, "Yes"\n',
"   :doc:`cert-flp30-c <cert/flp30-c>`,\n",
"   A non-doc row that should stay after docs\n",

And just compare two big strings or List[str].

It will be good to make normalize_list_rst function either accept list[str] and return list[str] or accept str and return str.
It will be easy to test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make other tests follow this idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make other tests follow this idea.

I'm working on this, will request another review when finished!

Comment on lines 107 to 115
self.assertIsInstance(report, str)
self.assertIn("Duplicate entries in 'Changes in existing checks':", report)
self.assertIn(
"-- Duplicate: - Improved :doc:`bugprone-easily-swappable-parameters",
report,
)
self.assertEqual(report.count("- At line "), 2)
self.assertIn("- At line 4:", report)
self.assertIn("- At line 14:", report)
Copy link
Contributor

@vbvictor vbvictor Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also, match report as a whole chunk of text (same in other tests below)

@zeyi2 zeyi2 requested a review from vbvictor November 9, 2025 12:49
@zeyi2
Copy link
Member Author

zeyi2 commented Nov 9, 2025

Hi, I've updated the testcases as requested.

Comment on lines 112 to 122
def normalize_list_rst(data: Union[str, List[str]]) -> Union[str, List[str]]:
"""Normalize list.rst; returns same type as input (str or list).

- If given a string, returns a single normalized string.
- If given a sequence of lines, returns a list of lines.
"""
if isinstance(data, str):
lines = data.splitlines(True)
return "".join(_normalize_list_rst_lines(lines))
else:
return _normalize_list_rst_lines(data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to support both str and List[str]? Can we stick to only one way of processing and use it everywhere. IMO, there is no point in writing complex code.

If we really need to pass raw str, then just do data.splitlines(True) on caller side, and we don't have to write 20 lines of additional code here.

i = 0
n = len(lines)

def key_for(line: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"key" meaning check name? Let's call it check_name for better context


def parse_bullet_blocks(
lines: Sequence[str], start: int, end: int
) -> Tuple[List[str], List[Tuple[str, List[str]]], List[str]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By looking at Tuple[List[str], List[Tuple[str, List[str]]], List[str]] I have no idea what to expect as an output. Can we give a comment what each part of the tuple means. Would be good to give proper name for each Tuple part with type alias.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those alias types could be reused later too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NamedTuple or @dataclass are better ways to express such things.

Copy link
Contributor

@vbvictor vbvictor Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm up to it.
I'm kinda bad at python TBH:(

return line.startswith("- ")


def parse_bullet_blocks(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is "private" function, start with _

Comment on lines 213 to 226
while i < n:
if not is_bullet_start(lines[i]):
break
bstart = i
i += 1
while i < n and not is_bullet_start(lines[i]):
if (
i + 1 < n
and set(lines[i + 1].rstrip("\n")) == {"^"}
and lines[i].strip()
):
break
i += 1
block = list(lines[bstart:i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code looks identical to one in parse_bullet_blocks, could we refactor it to separate function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least some part of it

Comment on lines 377 to 390
list_lines = read_text(list_doc)
rn_lines = read_text(rn_doc)
list_norm = normalize_list_rst("".join(list_lines))
rn_norm = normalize_release_notes(rn_lines)
if "".join(list_lines) != list_norm:
write_text(list_doc, list_norm)
if "".join(rn_lines) != rn_norm:
write_text(rn_doc, rn_norm)

report = _emit_duplicate_report(rn_lines, "Changes in existing checks")
if report:
sys.stderr.write(report)
return 3
return 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we here reuse process_release_notes and process_checks_list instead of custom logic?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

report = _emit_duplicate_report(rn_lines, "Changes in existing checks")
    if report:
        sys.stderr.write(report)
        return 3
    return 0

It is duplicated here and in process_release_notes.

def test_duplicate_detection_and_report(self):
# Ensure duplicate detection works properly when sorting is incorrect.
lines = [
"Changes in existing checks\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test that we handle correctly:

- Improved :doc:`llvm-prefer-isa-or-dyn-cast-in-conditionals
  <clang-tidy/checks/llvm/prefer-isa-or-dyn-cast-in-conditionals>` check:

  - Fix-it handles callees with nested-name-specifier correctly.

  - ``if`` statements with init-statement (``if (auto X = ...; ...)``) are
    handled correctly.

  - ``for`` loops are supported.

With nested - inside text

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new testcase for this, but I don't think the script can reorder the inner list for now. Do we need to handle that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, inner list should be left as is.
We don't often have them and IMO there is no point in ordering

@github-actions
Copy link

github-actions bot commented Nov 19, 2025

🐧 Linux x64 Test Results

  • 3053 tests passed
  • 7 tests skipped

@zeyi2
Copy link
Member Author

zeyi2 commented Nov 19, 2025

Sorry that I introduced the issue that causes the CI failure in #167689, fixing it right now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[clang-tidy][docs] Implement alphabetical validation of clang-tidy docs.

5 participants